04:00
2026-06-16
arxiv.org
large-language-models
Nemotron 3 Ultra: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning
NVIDIA released Nemotron 3 Ultra, a 550B-parameter hybrid Mamba-Transformer model with 55B active parameters, achieving up to 6x higher inference throughput than state-of-the-art LLMs while maintaining…